An Algorithm for Iterative Selection of Blocks of Features
نویسنده
چکیده
We focus on the problem of linear regression estimation in high dimension, when the parameter β is "sparse" (most of its coordinates are 0) and "blocky" (βi and βi+1 are likely to be equal). Recently, some authors defined estimators taking into account this information, such as the Fused-LASSO [19] or the S-LASSO [10] among others. However, there are no theoretical results about the obtained estimators in the general design matrix case. Here, we propose an alternative point of view, based on the Iterative Feature Selection method [1]. We propose an iterative algorithm that takes into account the fact that β is sparse and blocky, with no prior knowledge on the position of the blocks. Moreover, we give a theoretical result that ensures that every step of our algorithm actually improves the statistical performance of the obtained estimator. We provide some simulations, where our method outperforms LASSOtype methods in the cases where the parameter is sparse and blocky. Moreover, we give an application to real data (CGH arrays), that shows that our estimator can be used on large datasets.
منابع مشابه
Iterative Approach for Automatic Beam Angle Selection in Intensity Modulated Radiation Therapy Planning
Introduction: Beam-angle optimization (BAO) is a computationally intensive problem for a number of reasons. First, the search space of the solutions is huge, requiring enumeration of all possible beam orientation combinations. For example, when choosing 4 angles out of 36 candidate beam angles, C36 = 58905 possible combinations exist. Second, any change in a beam 4 config...
متن کاملOnline Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features
Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...
متن کاملNetwork Planning Using Iterative Improvement Methods and Heuristic Techniques
The problem of minimum-cost expansion of power transmission network is formulated as a genetic algorithm with the cost of new lines and security constraints and Kirchhoff’s Law at each bus bar included. A genetic algorithm (GA) is a search or optimization algorithm based on the mechanics of natural selection and genetics. An applied example is presented. The results from a set of tests carried ...
متن کاملA Parallel Genetic Algorithm Based Method for Feature Subset Selection in Intrusion Detection Systems
Intrusion detection systems are designed to provide security in computer networks, so that if the attacker crosses other security devices, they can detect and prevent the attack process. One of the most essential challenges in designing these systems is the so called curse of dimensionality. Therefore, in order to obtain satisfactory performance in these systems we have to take advantage of app...
متن کاملA Parallel Genetic Algorithm Based Method for Feature Subset Selection in Intrusion Detection Systems
Intrusion detection systems are designed to provide security in computer networks, so that if the attacker crosses other security devices, they can detect and prevent the attack process. One of the most essential challenges in designing these systems is the so called curse of dimensionality. Therefore, in order to obtain satisfactory performance in these systems we have to take advantage of app...
متن کاملFast SFFS-Based Algorithm for Feature Selection in Biomedical Datasets
Biomedical datasets usually include a large number of features relative to the number of samples. However, some data dimensions may be less relevant or even irrelevant to the output class. Selection of an optimal subset of features is critical, not only to reduce the processing cost but also to improve the classification results. To this end, this paper presents a hybrid method of filter and wr...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2010